Multiple species delimitation approaches with COI barcodes poorly fit each other and morphospecies – An integrative taxonomy case of Sri Lankan Sericini chafers (Coleoptera: Scarabaeidae)

Abstract DNA taxonomy including barcoding and metabarcoding is widely used to explore the diversity in biodiversity hotspots. In most of these hotspot areas, chafers are represented by a multitude of species, which are well defined by the complex shape of male genitalia. Here, we explore how well COI barcode data reflect morphological species entities and thus their usability for accelerated species inventorization. We conducted dedicated field surveys in Sri Lanka to collect the species‐rich and highly endemic Sericini chafers (Coleoptera: Scarabaeidae). Congruence among results of a series of protocols for de novo species delimitation and with morphology‐based species identifications was investigated. Different delimitation methods, such as the Poisson tree processes (PTP) model, Statistical Parsimony Analysis (TCS), Automatic Barcode Gap Discovery (ABGD), Assemble Species by Automatic Partitioning (ASAP), and Barcode Index Number (BIN) assignments, resulted in different numbers of molecular operational taxonomic units (MOTUs). All methods showed both over‐splitting and lumping of morphologically identified species. Only 18 of the observed 45 morphospecies perfectly matched MOTUs from all methods. The congruence of delimitation between MOTUs and morphospecies expressed by the match ratio was low, ranging from 0.57 to 0.67. TCS and multirate PTP (mPTP) showed the highest match ratio, while (BIN) assignment resulted in the lowest match ratio and most splitting events. mPTP lumped more species than any other method. Principal coordinate analysis (PCoA) on a match ratio‐based distance matrix revealed incongruent outcomes of multiple DNA delimitation methods, although applied to the same data. Our results confirm that COI barcode data alone are unlikely to correctly delimit all species, in particular, when using only a single delimitation approach. We encourage the integration of various approaches and data, particularly morphology, to validate species boundaries.


| INTRODUC TI ON
Many regions on Earth that are exceptionally rich in endemic species are facing massive habitat loss (Costello et al., 2013). Most of those areas have been identified as "biodiversity hotspots" (Myers et al., 2000). In order to be able to conserve the vast diversity of currently largely unknown species, one necessity is to recognize them (Costello et al., 2013;Modica et al., 2014;Smith et al., 2005).
However, DNA barcoding also has been critically discussed since its first emergence due to many problems coming particularly from the nature of the used single mtDNA marker gene (Ballard & Whitlock, 2004;Eberle et al., 2020;Krishnamurthy & Francis, 2012).
Many empirical studies investigated the robustness of DNA barcoding and the used species delimitation methods, particularly in the context of inherent natural bias of species such as fluctuating effective population size or unbalanced representation of specimen samples Esselstyn et al., 2012;Fujisawa & Barraclough, 2013).
While results of COI barcoding have been so far mainly compared with morphospecies entities, the congruence of the outcome of different DNA-based species delimitations has only rarely been analyzed in detail. Outcomes have often been characterized as "different" without quantifying the difference Dalstein et al., 2019;Lukic et al., 2021). These differences are explored here in detail, exemplified by a case study of Sri Lankan Sericini chafers.
Sri Lanka, together with Southern Indian Western Ghats, is one of the world's outstanding biodiversity hotspots, harboring unique and threatened biota (Myers et al., 2000). So far, only a handful of larger barcoding studies have been conducted on the Indian subcontinent.
For the highly diverse beetles, apart from a few isolated studies that were very limited in taxon sampling (Asha & Sinu, 2020;Dangalle et al., 2014), DNA taxonomy approaches including barcoding have not yet been applied in the Western Ghats hotspot. This is even true for herbivore scarab chafers, of which some species appear to be serious crop pests despite being highly endemic (Ahrens, 2004;. In the last decade, dozens of new herbivore scarab species have been discovered from the subcontinent, and Asia in general Ahrens et al., 2014aAhrens et al., , 2014bAhrens et al., , 2014cFabrizi & Ahrens, 2014;Liu et al., 2014aLiu et al., , 2014bLiu et al., , 2014cLiu et al., , 2015Liu et al., , 2016. Given the great use of COI barcode data for biodiversity assessments (Arribas et al., 2020(Arribas et al., , 2021, we were interested in expanding the existing punctual assessments of DNA barcoding Dalstein et al., 2019;Lukic et al., 2021) and in exploring how well COI barcode data reflect species entities in a understudied tropical hotspot. We chose Sericini chafer beetles (Coleoptera: Scarabaeidae: Melolonthinae) as example study group because species can be well defined by strongly differentiated male genitalia (Dalstein et al., 2019;Eberle et al., 2016). We performed dedicated field surveys in Sri Lanka and investigated the match of morphospecies with the entities inferred by commonly used species barcoding, integrative taxonomy, taxonomic match ratio

T A X O N O M Y C L A S S I F I C A T I O N
Entomology delimitation algorithms based on the sequenced COI barcode data.
Such focused tests continue to be necessary to further develop our understanding of frequently employed taxonomic markers in different organism groups, particularly in the light of potential drawbacks for accuracy of newly emerging approaches such as metabarcoding or "exclusively COI barcode-based species definitions" Sharkey, Janzen, et al., 2021).
These sites included different forest types in different ecozones.
Beetles were captured using ultraviolet-light traps and manual collecting from a white sheet being illuminated with ultraviolet, blue, and green LEDs (LepiLED, © WIF, Dr Gunnar Brehm, Jena, Germany).
Some additional specimens were hand collected during the day. All specimens were preserved in 96% ethanol after collecting.
The collected specimens were presorted to morphospecies.
For this purpose, all male genitalia were dissected and labeled.
Identification to species level was done using recent literature Fabrizi & Ahrens, 2014;Ranasinghe et al., 2020) and, in some cases, by comparison with type specimens. Three to seven male individuals of each morphospecies per location were selected for DNA extraction and subsequent sequencing (in total 280 individuals). The species' habitus and male genitalia were photographed of one selected specimen per species, using a Zeiss AxioCam HRc camera (SteREO Discovery. V20). Images at several focal points were taken using the Zeiss Axio Vision (ZEN pro) software package and stacked with Zerene Stacker (Version 1.04) (http://www.zeren esyst ems.com).  Table S1).

| Phylogenetic analysis
Maximum likelihood (ML; Felsenstein, 1973) searches were performed in IQ-TREE version 1.6.12 (Nguyen et al., 2015) under the (GTR+F+I+G4) model of nucleotide substitution that was inferred as the best-fit model by ModelFinder (Kalyaanamoorthy et al., 2017). A total of 1000 ultrafast bootstrap (Hoang et al., 2018) replicates were done to assess branch supports. The tree search was repeated 10 times with the above parameters and the tree with highest likelihood was selected for further analysis. The resulting tree was rooted with
Poisson tree process modeling was performed with PTP web server (https://speci es.h-its.org/; accessed on February 9, 2021) using the maximum likelihood implementation (hereafter mlPTP; Zhang et al., 2013) with a single Poisson distribution, as well as the Bayesian implementation (bPTP), which adds Bayesian support (pp) values for putative species to branches in the input tree. The PTP method infers speciation events based on a shift in the number of substitutions between internal nodes (Zhang et al., 2013).
Furthermore, multirate PTP (https://mptp.h-its.org/#/tree; accessed on July 23, 2021) was performed. Multirate PTP (hereafter mPTP; Kapli et al., 2017) is an improved method of PTP which does not require user-defined parameter as input and using MCMC it computes the support values for each clade, which can be used to assess the confidence of the ML delimitation. The IQ-TREE result from previous phylogenetic analysis was used as input for all PTP analyses.
Statistical parsimony analysis was performed as implemented in TCS v.1.21 (Clement et al., 2000). The procedure partitions the sequence data into clusters, i.e., subgroups (or networks) of closely related haplotypes connected by changes with a <95% probability to be non-homoplastic. Resulting networks have been found to be largely congruent with morphospecies at the 95% threshold (Ahrens et al., 2007;Meier et al., 2006) and are considered here as molecular operational taxonomic units (MOTUs).
Automatic Barcode Gap Discovery (ABGD) was conducted using the ABGD webserver (https://bioin fo.mnhn.fr/abi/publi c/abgd/ abgdw eb.html; accessed on February 9, 2021) with default parameters (i.e., using Jukes-Cantor model (JC69) distances, a relative gap width of 1 and 50 steps, Pmin = 0.001, Pmax = 0.1, and Nb bins for distance distribution = 20) (Puillandre et al., 2012). ABGD applies a set of prior intraspecific divergences to detect the position of the barcode gap, which are iteratively refined. Alternatively, we reran the ABGD analysis with a distance matrix generated through IQ-TREE analysis as the input file. This maximum likelihood distance values (mldist file) reflected pairwise distances corrected by the GTR model.
Assemble Species by Automatic Partitioning (ASAP) was conducted using the ASAP webserver (https://bioin fo.mnhn.fr/abi/ publi c/asap/ accessed on July 23, 2021) using the distance matrix generated through IQ-TREE analysis (Puillandre et al., 2020). ASAP divides species partitions based on pairwise genetic distances. ASAP also computes a probability of panmixia (p-val), a relative gap width metric (W), and ranked results by the ASAP score: the lower the score, the better the partitioning (Puillandre et al., 2020). The accuracy of DNA-based methods with prior morphospecies assignment was assessed by the match ratio : where N match is the number of exact matches of morphospecies (all individuals) with MOTUs of different delimitation methods, N mol is the number of MOTUs that resulted from different delimitation methods, and N morph is the number of morphospecies ( Table 1). All morphospecies were mapped onto the terminals of the maximum likelihood tree along with the images of their male genitalia (lateral view) and MOTUs obtained from different species delimitation methods ( Figure 2). Furthermore, the match ratios for all pairs of delimitation methods were calculated and compared in a similarity matrix. The matrix was transformed into a distance matrix and a principal coordinate analysis (PCoA) was performed in PAST v.3.25 (Hammer et al., 2001) in order to compare similarity between different methods.

| Data handling
All raw data produced in the project are freely accessible and deposited in dedicated databases for secure and curated long-term storage. Extracted DNA is stored at −80°C in the DNA bank collection of the ZFMK (https://www.zfmk.de/en/biobank). Obtained nucleotide sequences were submitted to NCBI (accession numbers MW698204-MW698469) and BOLD databases (project: SCOIB) (see Table S1). Voucher specimens are deposited in the insect collections of the ZFMK, each with a unique voucher ID that is linked to the corresponding DNA extract from that sample to the DNA sequences, and to the morphological data and images. Voucher-related data and IDs were stored in the ZFMK's digital collection database.

| RE SULTS
Morphological sorting of captured scarab specimens (total ca. 2300) resulted in a total of 45 Sericini morphospecies of which 280 individuals were selected for sequencing. These species included 41 Sri Lankan endemics and representatives from all five Sericini genera occurring in Sri Lanka. Thirty-four morphospecies were represented by more than one individual and 11 were singletons. For 266 specimens, we successfully obtained COI sequences (658 bp). From 266 individuals, 257 were assigned to putative morphospecies using male genital preparations. Nine specimens were females which did not have suitable diagnostic characters for morphospecies assignment; they were subsequently assigned to species using the DNA sequences as they were clearly nested in the relevant species clades (seven specimens: SR0088, SR0100, SR0118, SR0186, SR0190, SR0350, and SR0881) or unambiguously assigned to a species in all delimitation methods (two specimens: SR0089 and SR0095).
Despite our repeated extensive sampling (Figure 1), 75.5% of the taxa (34 morphospecies) were collected from only single localities.
For example, nine species (20% of all recorded species) were only found at Deanston (L4), 10 species (22.2%) only at Dambulla (L3), and 6 species (13.3%) only at Aranayake (L1). However, 11 morphospecies (24.5% of all recorded species) were represented in more than one locality, but none was found in more than half of all localities: Maladera fistulosa clade. The latter is a diverse, endemic radiation on the island that is characterized by entirely reduced parameres. It included eight so far new, unnamed species, which will be described in a separate publication.

| DISCUSS ION
This study focused on the investigation of the performance of COI barcode data with various species delimitation approaches since COI barcodes are widely used as a proxy for species taxonomy and for ecological monitoring. Specifically, we investigated how well the resulting MOTUs reflected species entities in a megadiverse chafer group. While being rather uniform in external appearance, Sericini chafers show extremely well-differentiated genitalia even between closely related species Dalstein et al., 2019). The correlation between divergent genital morphology and evolutionary entities was widely confirmed by integrative taxonomy studies (Ahrens & Ribera, 2009;Eberle et al., 2016), including even genomic data (Dietz et al., 2021).  Dalstein et al., 2019;Lukic et al., 2021). Some of them were subject to strong variation (between 0.14 and 1) depending on the different selected subclades being analyzed . Even in the absence of geographic sampling bias, these have shown low match ratios (0.59-0.77) (Lukic et al., 2021). Also, the congruence between the different delimitation methods, although using the same data, was moderate. This is in line with graphical summaries of many DNA taxonomy studies that have shown rather inconsistent results among different species delimitation approaches using the same marker (see above; Bergsten et al., 2012;Magoga et al., 2021).
However, some studies, particularly those with limited geographical (i.e., regional) scope in the northern hemisphere, showed almost perfect matches of MOTUs with morphospecies among nearly 90% of the studied species (Hendrich et al., 2015;Pentinsaari et al., 2014;Rulik et al., 2017). So far uninvestigated was their mutual multidimensional relations in terms of match ratios (Figure 3) . The observed divergent clusters in the plot of mutual match similarity, also in context with morphospecies, provided insight to the robustness and confidence of the results in a range of the used species delimitation approaches. Even using the same genetic marker, results differed conspicuously and call for caution regarding premature conclusions . Inconsistency between MOTUs and morphospecies could have been caused by highly rapid speciation: Maladera cervicornis and M.

Inconsistency between COI
haniel, which here both resulted as monophyletic sister taxa, have highly distinct male genitalia. They lumped in all methods except TCS. This could be indicative that divergence of their male genitalia is much faster and more distinct than mitochondrial molecular divergence and lineage sorting, which, although being complete, was not sufficient in terms of degree of divergence to delimit species unambiguously. Similar evidence for multiple species have been shown by Eberle et al. (2016) and confirmed with genomic data by Dietz et al. (2021).
Over-splitting of morphospecies encountered here was apparently also caused by relatively deep coalescence, for example, by considerable geographically determined genetic variation in haplotypes (Sel. impexa). Specimens of this species originated from two isolated lowland forest reserves without any heterogeneous landscapes in between (L8 and L10; see Figure 1), so that individuals might not be able to migrate between these populations. Splitting pusilla, and S. lurida were recorded from different forests in the central highlands with complex elevation patterns. A greater biodiversity is observed in these forests compared to lowland forests (Meegaskumbura et al., 2015), since they are separated by steep escarpments, gorges, parallel ridges, or peaks (Cooray, 1967), which may act as geographical barriers for dispersal and may result in partial reproductive isolation. However, spatial separation of individuals did not always cause over-splitting. Morphospecies of M. dubia, M.
hortonensis, M. rufocuprea, and Serica fusa were collected from different geographical locations and still appeared as a single entity in all analyses.
In general, as a result of limited dispersal, a negative relationship is expected for the geographic distance and the mating probability of individuals, as predicted by isolation by distance (Perez et al., 2018).
Heterogeneous landscapes additionally might affect levels of gene flow due to reduced dispersal in consequence of the patchiness of preferred habitats (Perez et al., 2018;van Strien et al., 2015). Thus, it is obvious that low dispersal propensity contributes to an elevated but unknown extent of intraspecific, geographically structured divergence (Li et al., 2015). Alternatively, genetic divergence may also result from ecological adaptation or sexual selection (Boughman, 2001;Matsubayashi et al., 2013). What is actually more likely for each case under study is often unknown, as is the case of the Sri Lankan Sericini chafers studied here. Biased accumulation of mutations in mtDNA after population separation of widespread species can obscure the limits between putative species .  . Moreover, the effect of geography-induced genetic divergence depends on the latitude as well as the involvement of diversity hotspots (Gaytán et al., 2020), which generally are also refugial areas (Ahrens et al., 2013), characterized by long-term climatic stability (Fjeldså et al., 1999;Fjeldsaå et al., 1997;Harrison & Noss, 2017). All this would affect the output of different species delimitation methods and in fact, none of the used methods report accurate species numbers compared to prior morphospecies assignments.
Besides the above discussed issues like incomplete lineage sorting and introgression (Ballard & Whitlock, 2004;Funk & Omland, 2003), molecular species delimitation approaches based on information from a single gene such as the mitochondrial gene COI are frequently hampered by pseudogene co-amplification or Wolbachia infections (Smith et al., 2012;Song et al., 2008), which may bias haplotype distributions. Furthermore, sampling size influences the results of delimitation methods Luo et al., 2018).
Low number of samples may affect species delimitation. Our sample size ranged from 5 to 10 individuals per species and sampling was the same for all delimitation methods, and therefore did not affect their comparison. The estimation of a tree topology also affects delimitation methods; relying on a single mitochondrial DNA marker system is prone to errors that can mislead species delimitation and identification (see Eberle et al., 2020).
The present study indicated that not the study organisms, i.e., the data itself, is the sole cause for incongruent species entities that were proposed by different methods. If signals were inherent to the data that caused mismatches with morphospecies, the same pattern of over-splitting and lumping would be expected across all methods.
This was not always the case (Figure 2). We conclude that in these cases the used single marker system provided insufficient or misleading signal for accurate delimitation of species.
In order to bypass some of these difficulties of incongruence of morphospecies and species identification or delimitation with COI data, there have been proposals for a haplotype-based macroecology, as patterns of intraspecific genetic diversity were found to be correlated with species richness (Papadopoulou et al., 2011) even at different spatial scales (Baselga et al., 2015). This way, highly valuable and easily produced data from, e.g., metabarcoding can be used (Taberlet et al., 2012). This becomes especially relevant in absence of complete reference barcode libraries in order to avoid high amounts of data deletion due to impossible species assignments.
Our results confirm that COI DNA barcode data alone are inadequate to delimit species, in particular in this case of Sericini chafers. Using various levels of haplotype diversity for ecological or evolutionary assessments bear high levels of uncertainty, as they might reflect different patterns at variable scales in time and space.
However, these patterns appear to be not entirely and stringently evolutionary significant, as are those ones reflected by species. Thus, although haplotype diversity and species diversity are correlated, any ecological approach that does not take (true) species entities and species diversity in account might look at different ecological interrelationships and processes than those considering true species (Papadopoulou et al., 2011;Thormann et al., 2016). Also, our results strongly discourage approaches of a minimalist taxonomic procedure, defining (new) species based on COI barcode data alone, using a single species delimitation approach only without morphological reference diagnoses (Meierotto et al., 2019;Sharkey, Janzen, et al., 2021) (see also Ahrens et al., 2021;Fernandez-Triana, 2022;Meier et al., 2022;Zamani et al., 2021).
Due to the severe impact of human activities including climate change, numerous species risk going extinct before being discovered (Costello et al., 2013). An estimated 10 million species remain to be discovered (Dayrat, 2004). A stable and robust nomenclature is the basis of clear communication and scientific discussion about biodiversity. Including true species entities within biodiversity research incorporates evolutionary scales and processes at all time levels.
In this manner, species entities and names provide the "anchor" to which all taxonomic, ecological, molecular, and conservation data are attached (International Trust for Zoological Nomenclature, 2008).
Legal protection and policy are also linked to names (i.e., species), not to actual (mortal) individuals (or haplotypes), on the assumption that the groups indicated by the names are consistent through time and among places.
However, issues of sampling and the inherent nature of species (e.g., the fluctuation of effective population size;  are variables that will always impact large-scale approaches and require continued integration with other sources of evidence (Padial et al., 2010;Schlick-Steiner et al., 2010), thereby specifically accounting for characteristics of every single species (e.g., Campillo et al., 2020;Dufresnes et al., 2020;Hausdorf & Hennig, 2020).

CO N FLI C T O F I NTE R E S T
We have no conflicts of interest to declare.

DATA AVA I L A B I L I T Y S TAT E M E N T
DNA sequences: GenBank accessions (MW698204-MW698469); BOLD databases (project: SCOIB).